Overview

Dataset statistics

Number of variables: 19
Number of observations: 1070
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 159.0 KiB
Average record size in memory: 152.1 B

Variable types

Numeric: 13
Categorical: 5
Boolean: 1

Alerts

WeekofPurchase is highly overall correlated with PriceCH and 9 other fields (High correlation)
PriceCH is highly overall correlated with WeekofPurchase and 13 other fields (High correlation)
PriceMM is highly overall correlated with WeekofPurchase and 12 other fields (High correlation)
DiscCH is highly overall correlated with WeekofPurchase and 8 other fields (High correlation)
DiscMM is highly overall correlated with WeekofPurchase and 8 other fields (High correlation)
LoyalCH is highly overall correlated with Id and 3 other fields (High correlation)
SalePriceMM is highly overall correlated with WeekofPurchase and 12 other fields (High correlation)
SalePriceCH is highly overall correlated with WeekofPurchase and 13 other fields (High correlation)
PriceDiff is highly overall correlated with WeekofPurchase and 11 other fields (High correlation)
PctDiscMM is highly overall correlated with WeekofPurchase and 11 other fields (High correlation)
PctDiscCH is highly overall correlated with WeekofPurchase and 8 other fields (High correlation)
ListPriceDiff is highly overall correlated with WeekofPurchase and 11 other fields (High correlation)
Purchase is highly overall correlated with LoyalCH (High correlation)
StoreID is highly overall correlated with Id and 9 other fields (High correlation)
SpecialCH is highly overall correlated with DiscCH and 5 other fields (High correlation)
SpecialMM is highly overall correlated with PriceCH and 4 other fields (High correlation)
Store7 is highly overall correlated with StoreID and 9 other fields (High correlation)
STORE is highly overall correlated with Id and 9 other fields (High correlation)
Id is highly overall correlated with StoreID and 2 other fields (High correlation)
Id is uniformly distributed (Uniform)
Id has unique values (Unique)
DiscCH has 838 (78.3%) zeros (Zeros)
DiscMM has 746 (69.7%) zeros (Zeros)
PriceDiff has 75 (7.0%) zeros (Zeros)
PctDiscMM has 746 (69.7%) zeros (Zeros)
PctDiscCH has 838 (78.3%) zeros (Zeros)
ListPriceDiff has 119 (11.1%) zeros (Zeros)
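
The Zeros alerts report each zero count as a share of the 1070 observations. A minimal sketch of that arithmetic, using only the counts listed in the alerts (the names `n_rows`, `zero_counts`, and `zero_share` are illustrative, not part of the report):

```python
# Zero counts copied from the Zeros alerts; n_rows is the number of observations.
n_rows = 1070
zero_counts = {
    "DiscCH": 838,
    "DiscMM": 746,
    "PriceDiff": 75,
    "PctDiscMM": 746,
    "PctDiscCH": 838,
    "ListPriceDiff": 119,
}

# Share of zeros per column, formatted to one decimal place like the report.
zero_share = {
    name: f"{100 * count / n_rows:.1f}%" for name, count in zero_counts.items()
}

for name, share in zero_share.items():
    print(f"{name}: {share}")
```

DiscCH and PctDiscCH share a zero count, as do DiscMM and PctDiscMM, which is consistent with a percentage discount being zero whenever the discount itself is zero.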

Reproduction

Analysis started: 2022-11-24 14:53:36.198993
Analysis finished: 2022-11-24 14:54:09.143449
Duration: 32.94 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json
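
A report of this kind could be regenerated along the following lines. This is a minimal sketch, assuming pandas and pandas-profiling 3.x are installed; the function name `build_profile`, both file paths, and the report title are placeholders, since the report does not record how the data was loaded.

```python
def build_profile(csv_path, html_path="report.html"):
    """Profile the CSV at csv_path and write an HTML report.

    Both paths are placeholders; the original input file is not named in the report.
    """
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Dataset profile")  # title is illustrative
    report.to_file(html_path)
    return report
```

Calling `build_profile("data.csv")` with a real file would write `report.html` next to the script; the statistics in the output depend on the library version and its configuration.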

Variables

Id
Real number (ℝ)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 1070
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 535.5
Minimum: 1
Maximum: 1070
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 54.45
Q1: 268.25
Median: 535.5
Q3: 802.75
95-th percentile: 1016.55
Maximum: 1070
Range: 1069
Interquartile range (IQR): 534.5

Descriptive statistics

Standard deviation: 309.0267
Coefficient of variation (CV): 0.57708067
Kurtosis: -1.2
Mean: 535.5
Median Absolute Deviation (MAD): 267.5
Skewness: 0
Sum: 572985
Variance: 95497.5
Monotonicity: Strictly increasing
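
Because Id takes each integer from 1 to 1070 exactly once, these statistics can be reproduced without the data. The sketch below uses only the Python standard library. The helper `linear_quantile` is written here for illustration and assumes linear interpolation between order statistics, which agrees with the percentiles reported above.

```python
import statistics

ids = list(range(1, 1071))  # Id is unique and strictly increasing from 1 to 1070

mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 denominator)
std = statistics.stdev(ids)
cv = std / mean  # coefficient of variation
median = statistics.median(ids)
mad = statistics.median([abs(x - median) for x in ids])  # median absolute deviation


def linear_quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


q1 = linear_quantile(ids, 0.25)
q3 = linear_quantile(ids, 0.75)
print(mean, variance, round(std, 4), round(cv, 8), mad, q1, q3, q3 - q1)
```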
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 1 | 0.1%
719 | 1 | 0.1%
705 | 1 | 0.1%
706 | 1 | 0.1%
707 | 1 | 0.1%
708 | 1 | 0.1%
709 | 1 | 0.1%
710 | 1 | 0.1%
711 | 1 | 0.1%
712 | 1 | 0.1%
Other values (1060) | 1060 | 99.1%

Smallest values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Largest values

Value | Count | Frequency (%)
1070 | 1 | 0.1%
1069 | 1 | 0.1%
1068 | 1 | 0.1%
1067 | 1 | 0.1%
1066 | 1 | 0.1%
1065 | 1 | 0.1%
1064 | 1 | 0.1%
1063 | 1 | 0.1%
1062 | 1 | 0.1%
1061 | 1 | 0.1%

Purchase
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 KiB

Value | Count
CH | 653
MM | 417

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2140
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CH
2nd row: CH
3rd row: CH
4th row: MM
5th row: CH

Common Values

Value | Count | Frequency (%)
CH | 653 | 61.0%
MM | 417 | 39.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
ch | 653 | 61.0%
mm | 417 | 39.0%

Most occurring characters

Value | Count | Frequency (%)
M | 834 | 39.0%
C | 653 | 30.5%
H | 653 | 30.5%
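
The character counts follow directly from the category counts: every value is a two-letter code, so 653 CH values and 417 MM values give 2140 characters in total. A small standard-library sketch of that tally (variable names are illustrative):

```python
from collections import Counter

# Category counts taken from the Common Values table for Purchase.
category_counts = {"CH": 653, "MM": 417}

# Count every character across all values.
char_counts = Counter()
for value, count in category_counts.items():
    for char in value:
        char_counts[char] += count

total_chars = sum(char_counts.values())
for char, count in char_counts.most_common():
    print(char, count, f"{100 * count / total_chars:.1f}%")
```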

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 2140 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
M | 834 | 39.0%
C | 653 | 30.5%
H | 653 | 30.5%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2140 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
M | 834 | 39.0%
C | 653 | 30.5%
H | 653 | 30.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2140 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
M | 834 | 39.0%
C | 653 | 30.5%
H | 653 | 30.5%

WeekofPurchase
Real number (ℝ)

Distinct: 52
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 254.38131
Minimum: 227
Maximum: 278
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 227
5-th percentile: 229
Q1: 240
Median: 257
Q3: 268
95-th percentile: 276
Maximum: 278
Range: 51
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 15.558286
Coefficient of variation (CV): 0.061161279
Kurtosis: -1.2749267
Mean: 254.38131
Median Absolute Deviation (MAD): 14
Skewness: -0.21098982
Sum: 272188
Variance: 242.06027
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
274 | 39 | 3.6%
275 | 33 | 3.1%
269 | 32 | 3.0%
259 | 31 | 2.9%
260 | 29 | 2.7%
233 | 28 | 2.6%
267 | 26 | 2.4%
266 | 26 | 2.4%
256 | 26 | 2.4%
265 | 26 | 2.4%
Other values (42) | 774 | 72.3%

Smallest values

Value | Count | Frequency (%)
227 | 15 | 1.4%
228 | 22 | 2.1%
229 | 23 | 2.1%
230 | 19 | 1.8%
231 | 20 | 1.9%
232 | 19 | 1.8%
233 | 28 | 2.6%
234 | 18 | 1.7%
235 | 16 | 1.5%
236 | 23 | 2.1%

Largest values

Value | Count | Frequency (%)
278 | 18 | 1.7%
277 | 25 | 2.3%
276 | 26 | 2.4%
275 | 33 | 3.1%
274 | 39 | 3.6%
273 | 19 | 1.8%
272 | 25 | 2.3%
271 | 20 | 1.9%
270 | 23 | 2.1%
269 | 32 | 3.0%

StoreID
Categorical

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 KiB

Value | Count
7 | 356
2 | 222
3 | 196
1 | 157
4 | 139

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1070
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 7

Common Values

Value | Count | Frequency (%)
7 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
7 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Most occurring characters

Value | Count | Frequency (%)
7 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1070 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
7 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1070 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
7 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1070 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
7 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

PriceCH
Real number (ℝ)

Distinct: 10
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8674206
Minimum: 1.69
Maximum: 2.09
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 1.69
5-th percentile: 1.69
Q1: 1.79
Median: 1.86
Q3: 1.99
95-th percentile: 1.99
Maximum: 2.09
Range: 0.4
Interquartile range (IQR): 0.2

Descriptive statistics

Standard deviation: 0.10196975
Coefficient of variation (CV): 0.0546046
Kurtosis: -0.77513258
Mean: 1.8674206
Median Absolute Deviation (MAD): 0.1
Skewness: 0.064415883
Sum: 1998.14
Variance: 0.01039783
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value | Count | Frequency (%)
1.86 | 376 | 35.1%
1.99 | 259 | 24.2%
1.79 | 101 | 9.4%
1.75 | 95 | 8.9%
1.69 | 94 | 8.8%
1.89 | 42 | 3.9%
1.76 | 40 | 3.7%
1.96 | 29 | 2.7%
2.09 | 27 | 2.5%
2.06 | 7 | 0.7%
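
Since PriceCH has only 10 distinct values, the frequency table lists every observation, and the reported sum, mean, and variance can be recovered from it. A minimal sketch, assuming the reported variance uses the sample (n - 1) denominator:

```python
# (value, count) pairs copied from the PriceCH frequency table.
price_ch_counts = [
    (1.86, 376), (1.99, 259), (1.79, 101), (1.75, 95), (1.69, 94),
    (1.89, 42), (1.76, 40), (1.96, 29), (2.09, 27), (2.06, 7),
]

n = sum(count for _, count in price_ch_counts)
total = sum(value * count for value, count in price_ch_counts)
mean = total / n

# Sample variance from grouped data (assumed n - 1 denominator).
variance = sum(
    count * (value - mean) ** 2 for value, count in price_ch_counts
) / (n - 1)

print(n, round(total, 2), round(mean, 7), round(variance, 8))
```

The same approach applies to any variable in the report whose frequency table has no "Other values" row.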
Smallest values

Value | Count | Frequency (%)
1.69 | 94 | 8.8%
1.75 | 95 | 8.9%
1.76 | 40 | 3.7%
1.79 | 101 | 9.4%
1.86 | 376 | 35.1%
1.89 | 42 | 3.9%
1.96 | 29 | 2.7%
1.99 | 259 | 24.2%
2.06 | 7 | 0.7%
2.09 | 27 | 2.5%

Largest values

Value | Count | Frequency (%)
2.09 | 27 | 2.5%
2.06 | 7 | 0.7%
1.99 | 259 | 24.2%
1.96 | 29 | 2.7%
1.89 | 42 | 3.9%
1.86 | 376 | 35.1%
1.79 | 101 | 9.4%
1.76 | 40 | 3.7%
1.75 | 95 | 8.9%
1.69 | 94 | 8.8%

PriceMM
Real number (ℝ)

Distinct: 8
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0854112
Minimum: 1.69
Maximum: 2.29
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 1.69
5-th percentile: 1.69
Q1: 1.99
Median: 2.09
Q3: 2.18
95-th percentile: 2.23
Maximum: 2.29
Range: 0.6
Interquartile range (IQR): 0.19

Descriptive statistics

Standard deviation: 0.13438551
Coefficient of variation (CV): 0.064440774
Kurtosis: 2.1839215
Mean: 2.0854112
Median Absolute Deviation (MAD): 0.09
Skewness: -1.4618508
Sum: 2231.39
Variance: 0.018059466
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=8)]

Common values

Value | Count | Frequency (%)
2.09 | 273 | 25.5%
2.18 | 201 | 18.8%
2.13 | 197 | 18.4%
1.99 | 179 | 16.7%
2.23 | 88 | 8.2%
1.69 | 57 | 5.3%
2.29 | 40 | 3.7%
1.79 | 35 | 3.3%

Smallest values

Value | Count | Frequency (%)
1.69 | 57 | 5.3%
1.79 | 35 | 3.3%
1.99 | 179 | 16.7%
2.09 | 273 | 25.5%
2.13 | 197 | 18.4%
2.18 | 201 | 18.8%
2.23 | 88 | 8.2%
2.29 | 40 | 3.7%

Largest values

Value | Count | Frequency (%)
2.29 | 40 | 3.7%
2.23 | 88 | 8.2%
2.18 | 201 | 18.8%
2.13 | 197 | 18.4%
2.09 | 273 | 25.5%
1.99 | 179 | 16.7%
1.79 | 35 | 3.3%
1.69 | 57 | 5.3%

DiscCH
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 12
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.051859813
Minimum: 0
Maximum: 0.5
Zeros: 838
Zeros (%): 78.3%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.37
Maximum: 0.5
Range: 0.5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.11747419
Coefficient of variation (CV): 2.2652259
Kurtosis: 4.9336362
Mean: 0.051859813
Median Absolute Deviation (MAD): 0
Skewness: 2.4163405
Sum: 55.49
Variance: 0.013800186
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=12)]

Common values

Value | Count | Frequency (%)
0 | 838 | 78.3%
0.1 | 74 | 6.9%
0.37 | 36 | 3.4%
0.27 | 29 | 2.7%
0.47 | 21 | 2.0%
0.2 | 21 | 2.0%
0.13 | 21 | 2.0%
0.5 | 12 | 1.1%
0.3 | 6 | 0.6%
0.24 | 5 | 0.5%
Other values (2) | 7 | 0.7%

Smallest values

Value | Count | Frequency (%)
0 | 838 | 78.3%
0.1 | 74 | 6.9%
0.13 | 21 | 2.0%
0.16 | 5 | 0.5%
0.17 | 2 | 0.2%
0.2 | 21 | 2.0%
0.24 | 5 | 0.5%
0.27 | 29 | 2.7%
0.3 | 6 | 0.6%
0.37 | 36 | 3.4%

Largest values

Value | Count | Frequency (%)
0.5 | 12 | 1.1%
0.47 | 21 | 2.0%
0.37 | 36 | 3.4%
0.3 | 6 | 0.6%
0.27 | 29 | 2.7%
0.24 | 5 | 0.5%
0.2 | 21 | 2.0%
0.17 | 2 | 0.2%
0.16 | 5 | 0.5%
0.13 | 21 | 2.0%

DiscMM
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 12
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.12336449
Minimum: 0
Maximum: 0.8
Zeros: 746
Zeros (%): 69.7%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0.23
95-th percentile: 0.54
Maximum: 0.8
Range: 0.8
Interquartile range (IQR): 0.23

Descriptive statistics

Standard deviation: 0.21383375
Coefficient of variation (CV): 1.7333493
Kurtosis: 1.4920999
Mean: 0.12336449
Median Absolute Deviation (MAD): 0
Skewness: 1.5947836
Sum: 132
Variance: 0.045724872
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=12)]

Common values

Value | Count | Frequency (%)
0 | 746 | 69.7%
0.4 | 132 | 12.3%
0.54 | 42 | 3.9%
0.2 | 32 | 3.0%
0.3 | 29 | 2.7%
0.8 | 26 | 2.4%
0.24 | 18 | 1.7%
0.06 | 15 | 1.4%
0.74 | 10 | 0.9%
0.1 | 9 | 0.8%
Other values (2) | 11 | 1.0%

Smallest values

Value | Count | Frequency (%)
0 | 746 | 69.7%
0.06 | 15 | 1.4%
0.1 | 9 | 0.8%
0.2 | 32 | 3.0%
0.24 | 18 | 1.7%
0.3 | 29 | 2.7%
0.4 | 132 | 12.3%
0.54 | 42 | 3.9%
0.6 | 6 | 0.6%
0.7 | 5 | 0.5%

Largest values

Value | Count | Frequency (%)
0.8 | 26 | 2.4%
0.74 | 10 | 0.9%
0.7 | 5 | 0.5%
0.6 | 6 | 0.6%
0.54 | 42 | 3.9%
0.4 | 132 | 12.3%
0.3 | 29 | 2.7%
0.24 | 18 | 1.7%
0.2 | 32 | 3.0%
0.1 | 9 | 0.8%

SpecialCH
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 KiB

Value | Count
0 | 912
1 | 158

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1070
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 912 | 85.2%
1 | 158 | 14.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
0 | 912 | 85.2%
1 | 158 | 14.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 912 | 85.2%
1 | 158 | 14.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1070 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 912 | 85.2%
1 | 158 | 14.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1070 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 912 | 85.2%
1 | 158 | 14.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1070 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 912 | 85.2%
1 | 158 | 14.8%

SpecialMM
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 KiB

Value | Count
0 | 897
1 | 173

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1070
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 897 | 83.8%
1 | 173 | 16.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
0 | 897 | 83.8%
1 | 173 | 16.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 897 | 83.8%
1 | 173 | 16.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1070 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 897 | 83.8%
1 | 173 | 16.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1070 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 897 | 83.8%
1 | 173 | 16.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1070 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 897 | 83.8%
1 | 173 | 16.2%

LoyalCH
Real number (ℝ)

Distinct: 553
Distinct (%): 51.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.56578233
Minimum: 1.1 × 10⁻⁵
Maximum: 0.999947
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 1.1 × 10⁻⁵
5-th percentile: 0.009007
Q1: 0.32525725
Median: 0.6
Q3: 0.85087275
95-th percentile: 0.98535265
Maximum: 0.999947
Range: 0.999936
Interquartile range (IQR): 0.5256155

Descriptive statistics

Standard deviation: 0.30784253
Coefficient of variation (CV): 0.54410064
Kurtosis: -1.0598373
Mean: 0.56578233
Median Absolute Deviation (MAD): 0.2616135
Skewness: -0.2788888
Sum: 605.38709
Variance: 0.094767023
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0.5 | 61 | 5.7%
0.6 | 54 | 5.0%
0.4 | 50 | 4.7%
0.68 | 28 | 2.6%
0.32 | 25 | 2.3%
0.256 | 16 | 1.5%
0.744 | 16 | 1.5%
0.7952 | 15 | 1.4%
0.2048 | 12 | 1.1%
0.48 | 12 | 1.1%
Other values (543) | 781 | 73.0%

Smallest values

Value | Count | Frequency (%)
1.1 × 10⁻⁵ | 1 | 0.1%
1.4 × 10⁻⁵ | 1 | 0.1%
1.7 × 10⁻⁵ | 1 | 0.1%
2.2 × 10⁻⁵ | 1 | 0.1%
2.7 × 10⁻⁵ | 1 | 0.1%
3.4 × 10⁻⁵ | 1 | 0.1%
4.3 × 10⁻⁵ | 1 | 0.1%
5.3 × 10⁻⁵ | 1 | 0.1%
6.6 × 10⁻⁵ | 1 | 0.1%
8.3 × 10⁻⁵ | 1 | 0.1%

Largest values

Value | Count | Frequency (%)
0.999947 | 1 | 0.1%
0.999934 | 1 | 0.1%
0.999917 | 1 | 0.1%
0.999896 | 1 | 0.1%
0.99987 | 1 | 0.1%
0.999838 | 1 | 0.1%
0.999797 | 1 | 0.1%
0.999746 | 1 | 0.1%
0.999683 | 1 | 0.1%
0.999604 | 1 | 0.1%

SalePriceMM
Real number (ℝ)

Distinct: 18
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9620467
Minimum: 1.19
Maximum: 2.29
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 1.19
5-th percentile: 1.5305
Q1: 1.69
Median: 2.09
Q3: 2.13
95-th percentile: 2.23
Maximum: 2.29
Range: 1.1
Interquartile range (IQR): 0.44

Descriptive statistics

Standard deviation: 0.25269736
Coefficient of variation (CV): 0.12879273
Kurtosis: -0.51861837
Mean: 1.9620467
Median Absolute Deviation (MAD): 0.1
Skewness: -0.79716439
Sum: 2099.39
Variance: 0.063855957
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=18)]

Common values

Value | Count | Frequency (%)
2.09 | 209 | 19.5%
2.18 | 144 | 13.5%
1.69 | 127 | 11.9%
2.13 | 127 | 11.9%
1.59 | 101 | 9.4%
2.23 | 88 | 8.2%
1.99 | 66 | 6.2%
1.89 | 44 | 4.1%
1.79 | 44 | 4.1%
2.29 | 33 | 3.1%
Other values (8) | 87 | 8.1%

Smallest values

Value | Count | Frequency (%)
1.19 | 7 | 0.7%
1.38 | 19 | 1.8%
1.39 | 10 | 0.9%
1.48 | 5 | 0.5%
1.49 | 13 | 1.2%
1.58 | 6 | 0.6%
1.59 | 101 | 9.4%
1.69 | 127 | 11.9%
1.78 | 12 | 1.1%
1.79 | 44 | 4.1%

Largest values

Value | Count | Frequency (%)
2.29 | 33 | 3.1%
2.23 | 88 | 8.2%
2.18 | 144 | 13.5%
2.13 | 127 | 11.9%
2.12 | 15 | 1.4%
2.09 | 209 | 19.5%
1.99 | 66 | 6.2%
1.89 | 44 | 4.1%
1.79 | 44 | 4.1%
1.78 | 12 | 1.1%

SalePriceCH
Real number (ℝ)

Distinct: 13
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8155607
Minimum: 1.39
Maximum: 2.09
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 1.39
5-th percentile: 1.49
Q1: 1.75
Median: 1.86
Q3: 1.89
95-th percentile: 1.99
Maximum: 2.09
Range: 0.7
Interquartile range (IQR): 0.14

Descriptive statistics

Standard deviation: 0.14338359
Coefficient of variation (CV): 0.078974821
Kurtosis: 0.95978822
Mean: 1.8155607
Median Absolute Deviation (MAD): 0.1
Skewness: -0.95265054
Sum: 1942.65
Variance: 0.020558853
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=13)]

Common values

Value | Count | Frequency (%)
1.86 | 273 | 25.5%
1.99 | 183 | 17.1%
1.89 | 101 | 9.4%
1.79 | 101 | 9.4%
1.75 | 95 | 8.9%
1.69 | 90 | 8.4%
1.76 | 76 | 7.1%
1.49 | 48 | 4.5%
1.59 | 34 | 3.2%
1.96 | 29 | 2.7%
Other values (3) | 40 | 3.7%

Smallest values

Value | Count | Frequency (%)
1.39 | 27 | 2.5%
1.49 | 48 | 4.5%
1.59 | 34 | 3.2%
1.69 | 90 | 8.4%
1.75 | 95 | 8.9%
1.76 | 76 | 7.1%
1.79 | 101 | 9.4%
1.86 | 273 | 25.5%
1.89 | 101 | 9.4%
1.96 | 29 | 2.7%

Largest values

Value | Count | Frequency (%)
2.09 | 6 | 0.6%
2.06 | 7 | 0.7%
1.99 | 183 | 17.1%
1.96 | 29 | 2.7%
1.89 | 101 | 9.4%
1.86 | 273 | 25.5%
1.79 | 101 | 9.4%
1.76 | 76 | 7.1%
1.75 | 95 | 8.9%
1.69 | 90 | 8.4%

PriceDiff
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 36
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.14648598
Minimum: -0.67
Maximum: 0.64
Zeros: 75
Zeros (%): 7.0%
Negative: 239
Negative (%): 22.3%
Memory size: 8.5 KiB

Quantile statistics

Minimum: -0.67
5-th percentile: -0.4
Q1: 0
Median: 0.23
Q3: 0.32
95-th percentile: 0.54
Maximum: 0.64
Range: 1.31
Interquartile range (IQR): 0.32

Descriptive statistics

Standard deviation: 0.27156269
Coefficient of variation (CV): 1.8538476
Kurtosis: 0.46593925
Mean: 0.14648598
Median Absolute Deviation (MAD): 0.1
Skewness: -0.75977786
Sum: 156.74
Variance: 0.073746293
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=36)]

Common values

Value | Count | Frequency (%)
0.3 | 112 | 10.5%
0.32 | 111 | 10.4%
0.2 | 96 | 9.0%
0.24 | 94 | 8.8%
0 | 75 | 7.0%
0.64 | 48 | 4.5%
-0.16 | 46 | 4.3%
0.03 | 40 | 3.7%
0.33 | 37 | 3.5%
-0.2 | 37 | 3.5%
Other values (26) | 374 | 35.0%

Smallest values

Value | Count | Frequency (%)
-0.67 | 7 | 0.7%
-0.58 | 19 | 1.8%
-0.57 | 10 | 0.9%
-0.4 | 24 | 2.2%
-0.38 | 5 | 0.5%
-0.3 | 22 | 2.1%
-0.28 | 6 | 0.6%
-0.27 | 3 | 0.3%
-0.2 | 37 | 3.5%
-0.17 | 19 | 1.8%

Largest values

Value | Count | Frequency (%)
0.64 | 48 | 4.5%
0.54 | 29 | 2.7%
0.44 | 24 | 2.2%
0.42 | 20 | 1.9%
0.4 | 2 | 0.2%
0.38 | 5 | 0.5%
0.33 | 37 | 3.5%
0.32 | 111 | 10.4%
0.3 | 112 | 10.5%
0.27 | 36 | 3.4%

Store7
Boolean

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Value | Count | Frequency (%)
False | 714 | 66.7%
True | 356 | 33.3%

PctDiscMM
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 18
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.059298436
Minimum: 0
Maximum: 0.40201
Zeros: 746
Zeros (%): 69.7%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0.112676
95-th percentile: 0.253521
Maximum: 0.40201
Range: 0.40201
Interquartile range (IQR): 0.112676

Descriptive statistics

Standard deviation: 0.10176004
Coefficient of variation (CV): 1.7160662
Kurtosis: 1.2725805
Mean: 0.059298436
Median Absolute Deviation (MAD): 0
Skewness: 1.5394911
Sum: 63.449326
Variance: 0.010355106
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=18)]

Common values

Value | Count | Frequency (%)
0 | 746 | 69.7%
0.201005 | 59 | 5.5%
0.191388 | 54 | 5.0%
0.253521 | 42 | 3.9%
0.150754 | 29 | 2.7%
0.366972 | 19 | 1.8%
0.112676 | 18 | 1.7%
0.027523 | 15 | 1.4%
0.118343 | 13 | 1.2%
0.183486 | 12 | 1.1%
Other values (8) | 63 | 5.9%
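
Several of these values equal an MM discount divided by an MM list price, which suggests (the report does not state it) that PctDiscMM is DiscMM / PriceMM. The sketch below checks a few hand-picked pairs. Each discount and price appears in the DiscMM and PriceMM sections of the report, but the pairing itself is an assumption made for illustration.

```python
# Values of PctDiscMM reported in the frequency table.
reported = {0.201005, 0.191388, 0.253521, 0.150754, 0.366972, 0.112676, 0.027523}

# Hypothetical (DiscMM, PriceMM) pairs; the report does not say which rows
# combine these discounts and prices.
candidate_pairs = [
    (0.4, 1.99), (0.4, 2.09), (0.54, 2.13), (0.3, 1.99),
    (0.8, 2.18), (0.24, 2.13), (0.06, 2.18),
]

# Round to six decimals, the precision shown in the report.
ratios = [round(disc / price, 6) for disc, price in candidate_pairs]
matches = [ratio in reported for ratio in ratios]

for pair, ratio, ok in zip(candidate_pairs, ratios, matches):
    print(pair, ratio, ok)
```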
Smallest values

Value | Count | Frequency (%)
0 | 746 | 69.7%
0.027523 | 15 | 1.4%
0.050251 | 9 | 0.8%
0.095694 | 10 | 0.9%
0.100503 | 9 | 0.8%
0.112676 | 18 | 1.7%
0.118343 | 13 | 1.2%
0.150754 | 29 | 2.7%
0.174672 | 7 | 0.7%
0.183486 | 12 | 1.1%

Largest values

Value | Count | Frequency (%)
0.40201 | 7 | 0.7%
0.366972 | 19 | 1.8%
0.347418 | 10 | 0.9%
0.321101 | 5 | 0.5%
0.275229 | 6 | 0.6%
0.253521 | 42 | 3.9%
0.201005 | 59 | 5.5%
0.191388 | 54 | 5.0%
0.183486 | 12 | 1.1%
0.174672 | 7 | 0.7%

PctDiscCH
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 13
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.027313837
Minimum: 0
Maximum: 0.252688
Zeros: 838
Zeros (%): 78.3%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.198925
Maximum: 0.252688
Range: 0.252688
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.062232404
Coefficient of variation (CV): 2.2784204
Kurtosis: 4.8776958
Mean: 0.027313837
Median Absolute Deviation (MAD): 0
Skewness: 2.4223001
Sum: 29.225806
Variance: 0.0038728721
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=13)]

Common values

Value | Count | Frequency (%)
0 | 838 | 78.3%
0.050251 | 59 | 5.5%
0.198925 | 36 | 3.4%
0.145161 | 29 | 2.7%
0.252688 | 21 | 2.0%
0.095694 | 21 | 2.0%
0.068783 | 21 | 2.0%
0.053763 | 15 | 1.4%
0.251256 | 12 | 1.1%
0.177515 | 6 | 0.6%
Other values (3) | 12 | 1.1%

Smallest values

Value | Count | Frequency (%)
0 | 838 | 78.3%
0.050251 | 59 | 5.5%
0.053763 | 15 | 1.4%
0.068783 | 21 | 2.0%
0.091398 | 2 | 0.2%
0.091429 | 5 | 0.5%
0.095694 | 21 | 2.0%
0.120603 | 5 | 0.5%
0.145161 | 29 | 2.7%
0.177515 | 6 | 0.6%

Largest values

Value | Count | Frequency (%)
0.252688 | 21 | 2.0%
0.251256 | 12 | 1.1%
0.198925 | 36 | 3.4%
0.177515 | 6 | 0.6%
0.145161 | 29 | 2.7%
0.120603 | 5 | 0.5%
0.095694 | 21 | 2.0%
0.091429 | 5 | 0.5%
0.091398 | 2 | 0.2%
0.068783 | 21 | 2.0%

ListPriceDiff
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 18
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.21799065
Minimum: 0
Maximum: 0.44
Zeros: 119
Zeros (%): 11.1%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.14
Median: 0.24
Q3: 0.3
95-th percentile: 0.32
Maximum: 0.44
Range: 0.44
Interquartile range (IQR): 0.16

Descriptive statistics

Standard deviation: 0.10753545
Coefficient of variation (CV): 0.49330302
Kurtosis: -0.22899269
Mean: 0.21799065
Median Absolute Deviation (MAD): 0.06
Skewness: -0.64520812
Sum: 233.25
Variance: 0.011563873
Monotonicity: Not monotonic
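
The sums reported for the price columns fit a few simple identities, which would help explain the many high-correlation alerts among them. These identities are inferred from the column names and totals and are not stated in the report: sale price = list price - discount, PriceDiff = SalePriceMM - SalePriceCH, and ListPriceDiff = PriceMM - PriceCH. A minimal check against the reported sums:

```python
import math

# "Sum" values copied from the Descriptive statistics of each variable.
sums = {
    "PriceCH": 1998.14, "PriceMM": 2231.39,
    "DiscCH": 55.49, "DiscMM": 132.0,
    "SalePriceCH": 1942.65, "SalePriceMM": 2099.39,
    "PriceDiff": 156.74, "ListPriceDiff": 233.25,
}

# Inferred identities (an assumption): column name -> value implied by other columns.
implied = {
    "SalePriceCH": sums["PriceCH"] - sums["DiscCH"],
    "SalePriceMM": sums["PriceMM"] - sums["DiscMM"],
    "PriceDiff": sums["SalePriceMM"] - sums["SalePriceCH"],
    "ListPriceDiff": sums["PriceMM"] - sums["PriceCH"],
}

# Compare with a tolerance of half a cent, the rounding of the reported sums.
results = {
    name: math.isclose(sums[name], value, abs_tol=0.005)
    for name, value in implied.items()
}
print(results)
```

Matching sums do not prove that the identities hold row by row, only that the totals are consistent with them.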
[Figure: Histogram with fixed size bins (bins=18)]

Common values

Value | Count | Frequency (%)
0.24 | 159 | 14.9%
0.32 | 149 | 13.9%
0.27 | 143 | 13.4%
0 | 119 | 11.1%
0.3 | 119 | 11.1%
0.1 | 105 | 9.8%
0.23 | 75 | 7.0%
0.14 | 37 | 3.5%
0.2 | 32 | 3.0%
0.13 | 28 | 2.6%
Other values (8) | 104 | 9.7%

Smallest values

Value | Count | Frequency (%)
0 | 119 | 11.1%
0.07 | 7 | 0.7%
0.1 | 105 | 9.8%
0.13 | 28 | 2.6%
0.14 | 37 | 3.5%
0.17 | 10 | 0.9%
0.19 | 13 | 1.2%
0.2 | 32 | 3.0%
0.22 | 19 | 1.8%
0.23 | 75 | 7.0%

Largest values

Value | Count | Frequency (%)
0.44 | 24 | 2.2%
0.42 | 10 | 0.9%
0.33 | 11 | 1.0%
0.32 | 149 | 13.9%
0.3 | 119 | 11.1%
0.29 | 10 | 0.9%
0.27 | 143 | 13.4%
0.24 | 159 | 14.9%
0.23 | 75 | 7.0%
0.22 | 19 | 1.8%

STORE
Categorical

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 KiB

Value | Count
0 | 356
2 | 222
3 | 196
1 | 157
4 | 139

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1070
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%
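
The counts for STORE match those for StoreID except that the label 7 appears as 0, and the 356 rows for store 7 equal the True count of Store7. A short consistency check follows, assuming (the report does not say so) that STORE recodes StoreID 7 as 0 and that Store7 flags rows where StoreID is 7:

```python
# Counts copied from the Common Values tables of StoreID and STORE, and from Store7.
store_id_counts = {"7": 356, "2": 222, "3": 196, "1": 157, "4": 139}
store_counts = {"0": 356, "2": 222, "3": 196, "1": 157, "4": 139}
store7_counts = {False: 714, True: 356}

# Assumed recoding: StoreID "7" becomes STORE "0"; other labels are unchanged.
recoded = {
    ("0" if label == "7" else label): count
    for label, count in store_id_counts.items()
}

# Assumed flag: Store7 is True exactly for StoreID "7".
derived_store7 = {
    True: store_id_counts["7"],
    False: sum(count for label, count in store_id_counts.items() if label != "7"),
}

print(recoded == store_counts, derived_store7 == store7_counts)
```

If the assumption holds, the three columns carry the same information, which is in line with the high-correlation alerts raised for StoreID, Store7, and STORE.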

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
0 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1070 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1070 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1070 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 356 | 33.3%
2 | 222 | 20.7%
3 | 196 | 18.3%
1 | 157 | 14.7%
4 | 139 | 13.0%

Interactions

[Figures: interaction plots]
2022-11-24T19:53:58.860937image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:01.058013image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:03.165166image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:05.138757image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:07.403352image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:42.636625image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:44.814850image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:47.016184image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:48.947066image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:51.000548image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:53.127931image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:55.096630image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:57.138305image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:59.029486image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:01.262467image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:03.329726image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:05.523130image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:07.530342image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:42.771265image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:44.963451image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:47.161836image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:49.083739image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:51.140175image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:53.311448image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:55.237254image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:57.277933image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:59.179086image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:01.437997image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:03.473341image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:05.664665image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:07.663214image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:42.904907image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:45.181908image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:47.319371image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:49.221031image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:51.280472image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:53.464101image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:55.372931image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:57.409615image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:59.332677image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:01.616520image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:03.614012image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:05.801254image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:07.810779image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:43.057502image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:45.399295image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:47.481951image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:49.371628image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:51.426082image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:53.618593image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:55.511566image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:57.550345image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:59.504262image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:01.778088image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:03.766302image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:05.964818image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:07.957389image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:43.206103image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:45.593765image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:47.632260image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:49.509071image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:51.572593image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:53.767613image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:55.651454image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:57.688506image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:59.645291image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:01.925693image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:03.922881image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:06.116412image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:08.111973image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:43.355741image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:45.779273image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:47.788838image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:49.659058image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:51.728508image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:53.936160image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:55.800058image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:57.839104image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:59.802373image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:02.089255image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:04.088481image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:06.275985image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:08.259575image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:43.491092image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:45.923881image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:47.940438image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:49.792778image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:51.867140image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:54.091788image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:55.952026image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:57.980692image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:53:59.997848image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:02.235864image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:04.242072image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2022-11-24T19:54:06.409627image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/

Correlations


Auto

The auto setting chooses an interpretable association metric for each pair of columns according to the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramér's V, [0,1]
  • Numerical-Categorical : Cramér's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used to discretize the numerical column in a Numerical-Categorical pair can be changed using config.correlations["auto"].n_bins. The number of bins determines the granularity of the association being measured.

This configuration uses the recommended metric for each pair of columns.
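
The mapping above can be sketched in a few lines of Python. This is only an illustration, not the report generator's own code: the function names auto_method and discretize are hypothetical, and equal-width binning is an assumption, since the text does not say which discretization scheme is used.

```python
def auto_method(type_a: str, type_b: str) -> str:
    """Return the measure that the mapping assigns to a pair of column types."""
    if {type_a, type_b} == {"numerical"}:
        return "spearman"
    # Any pair involving a categorical column uses Cramér's V;
    # a numerical partner is discretized first.
    return "cramers_v"


def discretize(values, n_bins):
    """Equal-width binning (assumed scheme) into bin labels 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant column
    # Clamp so the maximum value lands in the last bin instead of a new one.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]
```

A larger n_bins keeps more of the numerical column's detail but leaves fewer observations in each bin, which is the granularity trade-off described above.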

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
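
As a minimal sketch of that calculation, the following pure-Python code ranks both variables and then applies the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their ranks is an assumption (it is the usual convention); the function names are illustrative.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations.
    The 1/n factors cancel, so plain sums are used."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))
```

For a monotonic but nonlinear pair such as x = [1, 2, 3, 4, 5] and y = x cubed, spearman returns 1 (up to floating-point rounding) while pearson on the raw values is below 1.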

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
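
The definition and the invariance property can be checked with a short pure-Python sketch. The numbers are a toy example, not values from this dataset, and the helper names are illustrative.

```python
from math import sqrt


def covariance(x, y):
    """Population covariance of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def std(x):
    """Population standard deviation, the square root of cov(x, x)."""
    return sqrt(covariance(x, x))


def pearson_r(x, y):
    """Covariance of X and Y divided by the product of their standard deviations."""
    return covariance(x, y) / (std(x) * std(y))


# Toy data: y is roughly 2x, so r should be close to +1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)

# Shifting and positively rescaling each variable separately leaves r unchanged.
r_rescaled = pearson_r([3 * a + 7 for a in x], [0.5 * b - 2 for b in y])
```

Multiplying one variable by a negative factor flips the sign of r but keeps its magnitude.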

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
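
The pair-counting definition translates directly into code. The sketch below implements exactly the formula stated here, which is often called tau-a; statistical libraries commonly report the tau-b variant, which additionally adjusts for ties, so results can differ on tied data. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The product is positive when both variables move in the same
        # direction, negative when they move in opposite directions,
        # and zero when either variable is tied.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)
```

For x = [1, 2, 3] and y = [1, 3, 2] there are two concordant pairs and one discordant pair out of three, giving τ = 1/3.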

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
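
A pure-Python sketch of the bias-corrected estimator follows. It uses the correction formulas as they are commonly stated for Bergsma's proposal; this is an assumption about the details, not a copy of the report generator's code, and the function name is illustrative.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long label sequences."""
    n = len(a)
    rows, cols = Counter(a), Counter(b)
    observed = Counter(zip(a, b))  # missing cells count as 0

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_label, row_total in rows.items():
        for col_label, col_total in cols.items():
            expected = row_total * col_total / n
            chi2 += (observed[(row_label, col_label)] - expected) ** 2 / expected

    r, k = len(rows), len(cols)
    phi2 = chi2 / n
    # Corrections: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0
```

With two perfectly aligned binary columns the function returns 1, and with a perfectly balanced independent table it returns 0.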

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

        Id  Purchase  WeekofPurchase  StoreID  PriceCH  PriceMM  DiscCH  DiscMM  SpecialCH  SpecialMM   LoyalCH  SalePriceMM  SalePriceCH  PriceDiff  Store7  PctDiscMM  PctDiscCH  ListPriceDiff  STORE
   0     1        CH             237        1     1.75     1.99    0.00     0.0          0          0  0.500000         1.99         1.75       0.24      No   0.000000   0.000000           0.24      1
   1     2        CH             239        1     1.75     1.99    0.00     0.3          0          1  0.600000         1.69         1.75      -0.06      No   0.150754   0.000000           0.24      1
   2     3        CH             245        1     1.86     2.09    0.17     0.0          0          0  0.680000         2.09         1.69       0.40      No   0.000000   0.091398           0.23      1
   3     4        MM             227        1     1.69     1.69    0.00     0.0          0          0  0.400000         1.69         1.69       0.00      No   0.000000   0.000000           0.00      1
   4     5        CH             228        7     1.69     1.69    0.00     0.0          0          0  0.956535         1.69         1.69       0.00     Yes   0.000000   0.000000           0.00      0
   5     6        CH             230        7     1.69     1.99    0.00     0.0          0          1  0.965228         1.99         1.69       0.30     Yes   0.000000   0.000000           0.30      0
   6     7        CH             232        7     1.69     1.99    0.00     0.4          1          1  0.972182         1.59         1.69      -0.10     Yes   0.201005   0.000000           0.30      0
   7     8        CH             234        7     1.75     1.99    0.00     0.4          1          0  0.977746         1.59         1.75      -0.16     Yes   0.201005   0.000000           0.24      0
   8     9        CH             235        7     1.75     1.99    0.00     0.4          0          0  0.982197         1.59         1.75      -0.16     Yes   0.201005   0.000000           0.24      0
   9    10        CH             238        7     1.75     1.99    0.00     0.4          0          0  0.985757         1.59         1.75      -0.16     Yes   0.201005   0.000000           0.24      0

Last rows

        Id  Purchase  WeekofPurchase  StoreID  PriceCH  PriceMM  DiscCH  DiscMM  SpecialCH  SpecialMM   LoyalCH  SalePriceMM  SalePriceCH  PriceDiff  Store7  PctDiscMM  PctDiscCH  ListPriceDiff  STORE
1060  1061        MM             236        1     1.75     1.99     0.0    0.00          0          0  0.695258         1.99         1.75       0.24      No   0.000000   0.000000           0.24      1
1061  1062        MM             242        1     1.86     1.99     0.0    0.30          0          1  0.556206         1.69         1.86      -0.17      No   0.150754   0.000000           0.13      1
1062  1063        MM             245        7     1.86     2.09     0.0    0.20          0          0  0.444965         1.89         1.86       0.03     Yes   0.095694   0.000000           0.23      0
1063  1064        CH             251        1     1.76     2.09     0.0    0.00          0          0  0.355972         2.09         1.76       0.33      No   0.000000   0.000000           0.33      1
1064  1065        CH             251        7     1.86     2.09     0.1    0.00          0          0  0.484778         2.09         1.76       0.33     Yes   0.000000   0.053763           0.23      0
1065  1066        CH             252        7     1.86     2.09     0.1    0.00          0          0  0.587822         2.09         1.76       0.33     Yes   0.000000   0.053763           0.23      0
1066  1067        CH             256        7     1.86     2.18     0.0    0.00          0          0  0.670258         2.18         1.86       0.32     Yes   0.000000   0.000000           0.32      0
1067  1068        MM             257        7     1.86     2.18     0.0    0.00          0          0  0.736206         2.18         1.86       0.32     Yes   0.000000   0.000000           0.32      0
1068  1069        CH             261        7     1.86     2.13     0.0    0.24          0          0  0.588965         1.89         1.86       0.03     Yes   0.112676   0.000000           0.27      0
1069  1070        CH             270        1     1.86     2.18     0.0    0.00          0          0  0.671172         2.18         1.86       0.32      No   0.000000   0.000000           0.32      1